
    Distributed Tree Kernels

    In this paper, we propose distributed tree kernels (DTK), a novel method to reduce the time and space complexity of tree kernels. Using a linear-complexity algorithm to compute vectors for trees, we embed feature spaces of tree fragments in low-dimensional spaces, where the kernel computation reduces to a dot product. We show that DTKs are faster, correlate with tree kernels, and obtain statistically similar performance in two natural language processing tasks.
    Comment: ICML201
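The core idea, that an explicit fragment-overlap kernel can be approximated by a dot product of low-dimensional random embeddings, can be sketched as follows. This is a minimal illustration, not the authors' algorithm; the fragment labels and dimension are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4096  # embedding dimension (illustrative choice)

# Toy trees represented as sets of (invented) fragment labels.
frags_a = {"NP->DT NN", "VP->V NP", "S->NP VP"}
frags_b = {"NP->DT NN", "VP->V PP", "S->NP VP"}

# Give each fragment a random vector scaled by 1/sqrt(d), so that
# v . v is close to 1 and v . w is close to 0 for distinct fragments.
table = {f: rng.standard_normal(d) / np.sqrt(d) for f in frags_a | frags_b}

def embed(fragments):
    """Tree embedding: the sum of its fragment vectors."""
    return sum(table[f] for f in fragments)

exact = len(frags_a & frags_b)                    # explicit fragment-overlap kernel
approx = float(embed(frags_a) @ embed(frags_b))   # dot-product approximation
```

As the dimension grows, `approx` concentrates around `exact`; the full DTK construction additionally builds the tree vectors compositionally in linear time rather than enumerating fragments.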

    Parsing with CYK over Distributed Representations

    Syntactic parsing is a key task in natural language processing, long dominated by symbolic, grammar-based parsers. Neural networks, with their distributed representations, are challenging these methods. In this article we show that existing symbolic parsing algorithms can be entirely reformulated over distributed representations. To this end we introduce a version of the traditional Cocke-Younger-Kasami (CYK) algorithm, called D-CYK, which is defined entirely over distributed representations. Our D-CYK uses matrix multiplication on real-valued matrices whose size is independent of the length of the input string. These operations are compatible with traditional neural networks. Experiments show that our D-CYK approximates the original CYK algorithm. By showing that CYK can be performed entirely on distributed representations, we open the way to the definition of recurrent layers of CYK-informed neural networks.
    Comment: The algorithm has been greatly improved. Experiments have been redesigne
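For reference, the symbolic algorithm that D-CYK approximates is the classic CYK recognizer over a grammar in Chomsky normal form; a compact sketch, with a toy grammar invented for the example:

```python
def cyk(tokens, lexical, binary, start="S"):
    """Return True iff `tokens` is derivable from `start`.
    lexical: set of (A, terminal) rules; binary: set of (A, B, C) rules A -> B C."""
    n = len(tokens)
    # table[i][j] holds the nonterminals deriving tokens[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        table[i][i + 1] = {A for A, t in lexical if t == tok}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for A, B, C in binary:
                    if B in table[i][k] and C in table[k][j]:
                        table[i][j].add(A)
    return start in table[0][n]

lexical = {("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "saw")}
binary = {("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")}
ok = cyk("the dog saw the cat".split(), lexical, binary)
```

D-CYK replaces the set-valued cells of this table with fixed-size distributed vectors, so that the combination step becomes a matrix multiplication instead of a rule lookup.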

    Empowering Multi-step Reasoning across Languages via Tree-of-Thoughts

    Chain-of-Thought (CoT) prompting empowers the reasoning abilities of Large Language Models (LLMs), eliciting them to solve complex reasoning tasks step by step. Despite the success of CoT methods, however, the ability to deliver multi-step reasoning remains largely limited to English because of the imbalanced distribution of pre-training data, leaving other languages behind. In this work, we propose a cross-lingual multi-step reasoning approach that aims to align reasoning processes across different languages. In particular, our method, through a self-consistent cross-lingual prompting mechanism inspired by the Tree-of-Thoughts approach, delivers multi-step reasoning paths in different languages that, step by step, lead to the final solution. Our experimental evaluations show that our method significantly outperforms existing prompting methods, reducing the number of interactions and achieving state-of-the-art performance.
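Although the paper's prompting pipeline is more involved, the self-consistency step it builds on, sampling several reasoning paths and keeping the majority final answer, can be sketched as follows (the paths below are invented placeholders, not model output):

```python
from collections import Counter

def self_consistent_answer(paths):
    """paths: list of (language, final_answer) pairs from independent reasoning chains.
    Returns the answer backed by the most chains (majority vote)."""
    votes = Counter(answer for _, answer in paths)
    return votes.most_common(1)[0][0]

# Hypothetical final answers extracted from chains prompted in different languages.
paths = [("en", "42"), ("it", "42"), ("de", "41"), ("fr", "42")]
best = self_consistent_answer(paths)
```

The cross-lingual twist is that the chains are elicited in different languages before their final answers are aligned and voted on.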

    HANS, are you clever? Clever Hans Effect Analysis of Neural Systems

    Instruction-tuned Large Language Models (It-LLMs) exhibit outstanding abilities to reason about the cognitive states, intentions, and reactions of the people involved in an interaction, letting humans guide and comprehend day-to-day social interactions effectively. Accordingly, several multiple-choice question (MCQ) benchmarks have been proposed to construct solid assessments of these abilities. However, earlier works have demonstrated an inherent "order bias" in It-LLMs that complicates proper evaluation. In this paper, we investigate the resilience of It-LLMs under a series of probing tests on four MCQ benchmarks. Introducing adversarial examples, we show a significant performance gap, mainly when the order of the choices is varied, which reveals a selection bias and calls the models' reasoning abilities into question. Observing a correlation between first positions and model choices, attributable to positional bias, we hypothesize the presence of structural heuristics in the decision-making process of It-LLMs, reinforced by the inclusion of salient examples in few-shot scenarios. Finally, using the Chain-of-Thought (CoT) technique, we elicit the models to reason, mitigating the bias and obtaining more robust models.
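A minimal way to probe the selection bias described above is to permute the answer options and check whether the chosen *content* stays fixed; a sketch with stub models (toy stand-ins, not It-LLMs):

```python
from itertools import permutations

def choices_across_orderings(model, question, options):
    """Collect the option contents a model selects across all orderings.
    `model(question, options)` returns the index of the chosen option."""
    picks = set()
    for perm in permutations(options):
        perm = list(perm)
        picks.add(perm[model(question, perm)])
    return picks  # an order-robust model yields a singleton set

positional = lambda q, opts: 0             # biased stub: always picks the first slot
robust = lambda q, opts: opts.index("4")   # robust stub: tracks the content

biased_picks = choices_across_orderings(positional, "2+2=?", ["4", "5", "3"])
robust_picks = choices_across_orderings(robust, "2+2=?", ["4", "5", "3"])
```

The size of the returned set quantifies how much the answer depends on position rather than content.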

    Distributed Smoothed Tree Kernel

    In this paper we explore the possibility of merging the world of Compositional Distributional Semantic Models (CDSM) with that of Tree Kernels (TK). In particular, we introduce a specific tree kernel (the smoothed tree kernel, or STK) and then show that it is possible to approximate this kernel with the dot product of two vectors obtained compositionally from the sentences, thereby creating a new CDSM.
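The compositional side of the approximation, building a sentence vector from word vectors and comparing sentences with a dot product, can be illustrated with the simplest additive composition (random vectors stand in for real distributional ones):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 300  # illustrative dimension
vocab = ["the", "cat", "sat", "slept", "a", "dog", "ran"]
emb = {w: rng.standard_normal(d) for w in vocab}

def compose(sentence):
    """Additive composition: the sentence vector is the sum of its word vectors."""
    return sum(emb[w] for w in sentence.split())

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

near = cos(compose("the cat sat"), compose("the cat slept"))
far = cos(compose("the cat sat"), compose("a dog ran"))
```

The smoothed tree kernel additionally weights lexical matches by syntactic structure; the point here is only that sentence similarity reduces to a vector operation.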

    Risk Assessment for Venous Thromboembolism in Chemotherapy-Treated Ambulatory Cancer Patients: A Machine Learning Approach

    OBJECTIVE: To design a precision medicine approach aimed at exploiting significant patterns in data, in order to produce venous thromboembolism (VTE) risk predictors for cancer outpatients that might be of advantage over the currently recommended model (Khorana score). DESIGN: Multiple kernel learning (MKL) based on support vector machines and random optimization (RO) models were used to produce VTE risk predictors (referred to as machine learning [ML]-RO) yielding the best classification performance over a training (3-fold cross-validation) and testing set. RESULTS: Attributes of the patient data set (n = 1179) were clustered into 9 groups according to clinical significance. Our analysis produced 6 ML-RO models in the training set, which yielded better likelihood ratios (LRs) than baseline models. Of interest, the most significant LRs were observed in 2 ML-RO approaches not including the Khorana score (ML-RO-2: positive likelihood ratio [+LR] = 1.68, negative likelihood ratio [-LR] = 0.24; ML-RO-3: +LR = 1.64, -LR = 0.37). The enhanced performance of ML-RO approaches over the Khorana score was further confirmed by the analysis of the areas under the precision-recall curve (AUCPR), which were higher for the ML-RO approaches (best performances: ML-RO-2: AUCPR = 0.212; ML-RO-3-K: AUCPR = 0.146) than for the Khorana score (AUCPR = 0.096). Of interest, the best-fitting model was ML-RO-2, in which blood lipids and body mass index/performance status retained the strongest weights, with a weaker association with tumor site/stage and drugs. CONCLUSIONS: Although the monocentric validation of the presented predictors might represent a limitation, these results demonstrate that a model based on MKL and RO may represent a novel methodological approach to derive VTE risk classifiers. Moreover, this study highlights the advantages of optimizing the relative importance of groups of clinical attributes in the selection of VTE risk predictors.
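The likelihood ratios used to compare the classifiers derive directly from sensitivity and specificity; a sketch with invented confusion-matrix counts (not the study's data):

```python
def likelihood_ratios(tp, fp, fn, tn):
    """+LR = sensitivity / (1 - specificity); -LR = (1 - sensitivity) / specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity

# Hypothetical counts for a VTE risk classifier on a test set.
plr, nlr = likelihood_ratios(tp=80, fp=40, fn=20, tn=60)
```

A +LR above 1 means a positive prediction raises the odds of VTE, and a -LR below 1 means a negative prediction lowers them, which is why higher +LR and lower -LR indicate a better classifier.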

    A Trip Towards Fairness: Bias and De-Biasing in Large Language Models

    A surge in the popularity of transformer-based Language Models (such as GPT (Brown et al., 2020) and PaLM (Chowdhery et al., 2022)) has opened the door to new Machine Learning applications, in particular in Natural Language Processing, where pre-training on large text corpora is essential for achieving remarkable results in downstream tasks. However, these Language Models appear to carry inherent biases toward certain demographics reflected in their training data. While research has attempted to mitigate this problem, existing methods either fail to remove the bias altogether, degrade performance, or are expensive. This paper examines the bias produced by promising Language Models when varying parameters and pre-training data. Finally, we propose a de-biasing technique that produces robust de-biased models that maintain performance on downstream tasks.

    Senso Comune as a Knowledge Base of Italian language: The Resource and its Development

    Senso Comune is a linguistic knowledge base for the Italian language, which accommodates the content of a legacy dictionary in a rich formal model. The model is implemented in a platform that allows a community of contributors to enrich the resource. We provide here an overview of the main project features, including the lexical-ontology model, the process of sense classification, and the annotation of meaning definitions (glosses) and lexicographic examples. We also illustrate the latest work on alignment with MultiWordNet, describing the methodologies that have been experimented with, sharing some preliminary results, and highlighting some remarkable findings about the semantic coverage of the two resources.